WHAT IS CLAIMED IS:

1. A cDNA microarray data correction system for
correcting global and local distortions of microarray data
more precisely and correcting measurement errors caused by
a difference in sensitivity between fluorescent dyes,
comprising:

an input device for inputting previously-adjusted
gene expression intensity data, considering flag
information indicating a removal of background noise and
reliability of each spot;

a data standardization means for standardizing the
gene expression intensity data by using grid-by-grid order
statistics for the input gene expression intensity data and
for transmitting the standardized gene expression intensity
data;

first correction means for estimating a distortion
depending on a spot position on grid coordinates for the
standardized gene expression intensity data by a
nonparametric smoothing method and for transmitting first
corrected gene expression intensity data whose distortion
has been corrected;

second correction means for performing an S-D
transformation for the first corrected gene expression
intensity data, for estimating a potential distortion
caused by a difference in sensitivity between the
fluorescent dyes in the gene expression intensity data by
the nonparametric smoothing method, and for transmitting
second corrected gene expression intensity data whose
distortion caused by the difference in sensitivity between
the fluorescent dyes has been corrected; and

an output device for outputting the second
corrected gene expression intensity data.

2. The cDNA microarray data correction system
according to claim 1, further comprising S-D transformation
means for quantifying the distortion of the gene expression
intensity data at an arbitrary stage and for visualizing it
on an S-D plot.

3. The cDNA microarray data correction system
according to claim 1 or 2, wherein the order statistics are
represented by the following EQ12 (where $w^{k}_{ij}(c)$ is the
standardized gene expression intensity data, $y^{k}_{ij}(c)$ is gene
expression intensity data of all spots obtained in a
channel, and $L_k(c)$ and $M_k(c)$ indicate 25 and 50 percent
points of the gene expression intensity data obtained in
channel c in grid k, respectively):

$$ w^{k}_{ij}(c) = \frac{y^{k}_{ij}(c) - L_k(c)}{M_k(c) - L_k(c)}, \quad c = 1, 2;\ i = 1, \dots, I;\ j = 1, \dots, J;\ k = 1, \dots, K. \tag{12} $$

4. The cDNA microarray data correction system
according to claim 1 or 2, wherein the order statistics are
represented by the following EQ13 (where $w^{k}_{ij}(c)$ is the
standardized gene expression intensity data, $y^{k}_{ij}(c)$ is gene
expression intensity data of all spots obtained in a
channel, and $A_k(c)$, $L_k(c)$ and $M_k(c)$ indicate 35, 10 and 90
percent points of the gene expression intensity data
obtained in channel c in grid k, respectively):

$$ w^{k}_{ij}(c) = \frac{y^{k}_{ij}(c) - A_k(c)}{M_k(c) - L_k(c)}, \quad c = 1, 2;\ i = 1, \dots, I;\ j = 1, \dots, J;\ k = 1, \dots, K. \tag{13} $$
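
The grid-by-grid standardization recited in claims 3 and 4 can be illustrated with a short Python sketch. It is not taken from the specification. It assumes that one channel is held in a NumPy array of shape (K, I, J) (grid, row, column), that NaN marks spots excluded by the flag information, and that the standardized value is the raw intensity shifted by one percent point and scaled by the spread between two percent points. The function name `standardize_grids` and its parameters are hypothetical.

```python
import numpy as np

def standardize_grids(y, center_pct=25.0, low_pct=25.0, high_pct=50.0):
    """Standardize intensities grid by grid using percent points (order statistics).

    y: array of shape (K, I, J) for one channel; NaN marks excluded spots.
    The defaults mimic a 25/50 percent-point variant; center_pct=35,
    low_pct=10, high_pct=90 mimics a 35/10/90 variant (assumed forms).
    """
    y = np.asarray(y, dtype=float)
    w = np.empty_like(y)
    for k in range(y.shape[0]):
        center, low, high = np.nanpercentile(y[k], [center_pct, low_pct, high_pct])
        spread = high - low
        if not np.isfinite(spread) or spread <= 0:
            raise ValueError(f"grid {k}: degenerate percent-point spread")
        w[k] = (y[k] - center) / spread
    return w

# Synthetic example: two grids with different background level and scale.
rng = np.random.default_rng(0)
raw = np.stack([
    100.0 + 10.0 * rng.standard_normal((6, 8)),
    300.0 + 40.0 * rng.standard_normal((6, 8)),
])
w = standardize_grids(raw)                      # 25/50 percent-point variant
w2 = standardize_grids(raw, 35.0, 10.0, 90.0)   # 35/10/90 percent-point variant
```

After this step every grid shares the same reference percent points, which is what allows grids and channels to be compared on a common scale.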

5. The cDNA microarray data correction system
according to claim 3 or 4, wherein said data
standardization means determines whether the gene
expression intensity data of all spots obtained in at least
two gene expression intensity data channels has been
standardized and continues the standardization until the gene
expression intensity data of all spots has been standardized.

6. The cDNA microarray data correction system
according to claim 1, wherein the standardized gene
expression intensity data is represented by a sum of a true
gene intensity and a distortion depending on the spot
position.

7. The cDNA microarray data correction system
according to claim 1, wherein said first correction means
describes the distortion depending on the spot position by
means of a nonparametric regression model represented by a
regression relation of distortions with an x-axis, a y-axis,
and an interaction of the x- and y-axes ($\alpha^{k}_{c}(i)$, $\beta^{k}_{c}(j)$, and
$\gamma^{k}_{c}((i - m_i)(j - m_j))$, respectively) and estimates the
distortion depending on the spot position ($\hat{\xi}^{k}_{ij}(c)$) by the
nonparametric smoothing method represented by the following
EQ14:

$$ \hat{\xi}^{k}_{ij}(c) = \hat{\alpha}^{k}_{c}(i) + \hat{\beta}^{k}_{c}(j) + \hat{\gamma}^{k}_{c}((i - m_i)(j - m_j)), \quad c = 1, 2;\ i = 1, \dots, I;\ j = 1, \dots, J. \tag{14} $$



8. The cDNA microarray data correction system
according to claim 7, wherein the distortion depending on
the spot position is corrected according to the following
EQ15 (where $\hat{z}^{k}_{ij}(c)$ is corrected true gene expression
intensity data):

$$ \hat{z}^{k}_{ij}(c) = w^{k}_{ij}(c) - \hat{\xi}^{k}_{ij}(c) \tag{15} $$
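
The position-dependent model of claim 7 (a row term, a column term and an interaction term) and the subtraction of claim 8 can be sketched as below. The claims name only "a nonparametric smoothing method", so this sketch assumes one possible choice: backfitting with a Gaussian-kernel (Nadaraya-Watson) smoother. It further assumes that $m_i$ and $m_j$ are the centre coordinates of the grid, that each component is centred so the overall level of the grid is kept, and that the grid has no missing spots. The names `kernel_smooth` and `estimate_spatial_distortion` and the bandwidth rule are hypothetical.

```python
import numpy as np

def kernel_smooth(x, r, bandwidth):
    """Nadaraya-Watson estimate of E[r | x] at every observed x (Gaussian kernel)."""
    d = (x[:, None] - x[None, :]) / bandwidth
    weights = np.exp(-0.5 * d**2)
    return weights @ r / weights.sum(axis=1)

def estimate_spatial_distortion(w, n_iter=20, bandwidth_frac=0.1):
    """Backfit alpha(i) + beta(j) + gamma((i - m_i)(j - m_j)) to one grid w of shape (I, J)."""
    n_rows, n_cols = w.shape
    i, j = np.meshgrid(np.arange(n_rows, dtype=float),
                       np.arange(n_cols, dtype=float), indexing="ij")
    i, j = i.ravel(), j.ravel()
    # Assumption: m_i and m_j are the centre coordinates of the grid.
    covariates = [i, j, (i - i.mean()) * (j - j.mean())]
    bandwidths = [bandwidth_frac * (np.ptp(x) + 1.0) for x in covariates]
    target = w.ravel() - np.median(w)           # remove the overall grid level
    parts = [np.zeros_like(target) for _ in covariates]
    for _ in range(n_iter):
        for a, (x, h) in enumerate(zip(covariates, bandwidths)):
            others = sum(p for b, p in enumerate(parts) if b != a)
            fit = kernel_smooth(x, target - others, h)
            parts[a] = fit - fit.mean()         # keep each component centred
    return sum(parts).reshape(n_rows, n_cols)

# Synthetic grid: a smooth distortion along the rows plus measurement noise.
rng = np.random.default_rng(1)
n_rows, n_cols = 12, 10
rows = np.arange(n_rows)[:, None]
trend = 0.5 * np.sin(rows / n_rows * np.pi) * np.ones((1, n_cols))
w = trend + 0.05 * rng.standard_normal((n_rows, n_cols))

xi_hat = estimate_spatial_distortion(w)   # estimated position-dependent distortion
z_hat = w - xi_hat                        # position-corrected data
```

In this toy example the row-to-row variation of `z_hat` is much smaller than that of `w`; other smoothers (for example local regression) could be substituted for `kernel_smooth` without changing the structure.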

9. The cDNA microarray data correction system
according to claim 8, wherein the S-D transformation in
said second correction means is performed according to the
following EQ16:

$$ u^{k}_{ij} = \hat{z}^{k}_{ij}(1) + \hat{z}^{k}_{ij}(2), \qquad v^{k}_{ij} = \hat{z}^{k}_{ij}(1) - \hat{z}^{k}_{ij}(2) \tag{16} $$
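
The S-D transformation recited above (a sum and a difference of the two channels) and the smoothing-based correction that the second correction means applies to it can be sketched as follows. Several points are assumptions rather than statements from the claims: the smoother is a running median of the difference over spots ordered by the sum, the correction keeps the sum unchanged and removes the estimated trend from the difference, and the function names and the `window` parameter are hypothetical.

```python
import numpy as np

def sd_transform(z1, z2):
    """Return (u, v): the sum and the difference of the two channels."""
    return z1 + z2, z1 - z2

def inverse_sd(u, v):
    """Recover the two channels from the sum u and the difference v."""
    return (u + v) / 2.0, (u - v) / 2.0

def running_median_trend(u, v, window=51):
    """Estimate the trend of v against u by a running median over spots sorted by u."""
    order = np.argsort(u)
    v_sorted = v[order]
    half = window // 2
    trend_sorted = np.array([
        np.median(v_sorted[max(0, s - half): s + half + 1])  # window truncated at the ends
        for s in range(v_sorted.size)
    ])
    trend = np.empty_like(trend_sorted)
    trend[order] = trend_sorted                 # back to the original spot order
    return trend

def correct_dye_bias(z1, z2, window=51):
    """Remove the intensity-dependent trend from the difference; keep the sum."""
    u, v = sd_transform(np.ravel(z1), np.ravel(z2))
    phi_hat = running_median_trend(u, v, window)
    c1, c2 = inverse_sd(u, v - phi_hat)
    return c1.reshape(np.shape(z1)), c2.reshape(np.shape(z2))

# Synthetic example: channel 1 reads 20 percent high relative to channel 2.
rng = np.random.default_rng(2)
base = rng.uniform(0.0, 4.0, size=500)
z1 = 1.2 * base + 0.05 * rng.standard_normal(500)
z2 = base + 0.05 * rng.standard_normal(500)

c1, c2 = correct_dye_bias(z1, z2)
u, v_before = sd_transform(z1, z2)
_, v_after = sd_transform(c1, c2)
corr_before = np.corrcoef(u, v_before)[0, 1]
corr_after = np.corrcoef(u, v_after)[0, 1]
```

Here `corr_before` is large because the difference grows with the sum, while `corr_after` is close to zero once the estimated trend has been removed.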



10. The cDNA microarray data correction system
according to claim 9, wherein said second correction means
describes the distortion by means of a nonparametric
regression model represented by the following EQ17,
estimates a measurement error caused by the difference in
sensitivity between the fluorescent dyes by a nonparametric
smoothing method represented by the following EQ18 and EQ19,
and corrects the error:

$$ v^{k}_{ij} = \phi(u^{k}_{ij}) + \varepsilon^{k}_{ij}, \quad \varepsilon^{k}_{ij} \sim N(0, \nu^{2}) \tag{17} $$

$$ \hat{v}^{k}_{ij} = \hat{\phi}(u^{k}_{ij}) \tag{18} $$

(19)

11. The cDNA microarray data correction system
according to claim 1, wherein, supposing that a probability
of gene expression is lower than 0.5, it is assumed for the
correction that the fluorescence intensity detected at more
than half of the spots within each grid indicates a
background noise or a systematic error.

12. The cDNA microarray data correction system
according to claim 11, wherein, supposing that $L_k(c)$ and
$M_k(c)$ indicate 25 and 50 percent points of the fluorescence
intensity obtained in at least two gene expression
intensity data channels in a grid, it is further assumed
for the correction that $L_k(c)$ and $M_k(c) - L_k(c)$ are equal
among the grids and the channels on condition that most
genes are in a non-expression state and that a distribution
of the fluorescence intensity at or below the 50 percent point
is common to all grids and channels.

13. A cDNA microarray data correction method of
correcting global and local distortions of microarray data
more precisely and correcting measurement errors caused by
a difference in sensitivity between fluorescent dyes,
comprising the steps of:

inputting previously-adjusted gene expression
intensity data, considering flag information indicating a
removal of background noise and reliability of each spot;

standardizing the gene expression intensity data
by using grid-by-grid order statistics for the input gene
expression intensity data on condition that most genes are
in a non-expression state;

outputting the standardized gene expression
intensity data;

estimating a distortion depending on the spot
position on grid coordinates for the standardized gene
expression intensity data by a nonparametric smoothing
method and correcting the data distortion depending on the
spot position;

outputting the first corrected gene expression
intensity data whose distortion depending on the spot
position has been corrected;

performing an S-D transformation for the first
corrected gene expression intensity data, estimating a
potential distortion caused by a difference in sensitivity
between the fluorescent dyes in the gene expression
intensity data by the nonparametric smoothing method, and
correcting the distortion caused by the difference in
sensitivity between the fluorescent dyes; and

outputting the second corrected gene expression
intensity data whose distortion caused by the difference in
sensitivity between the fluorescent dyes has been corrected.

14. The cDNA microarray data correction method
according to claim 13, further comprising a step of
quantifying the distortion of the gene expression intensity
data at an arbitrary stage and visualizing it on an S-D
plot.
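
One way to put a number on the distortion visible in an S-D plot at any stage is sketched below. The claims do not specify a particular measure, so the score used here (the root mean square of the per-bin medians of the channel difference, taken over quantile bins of the channel sum) is purely an illustrative assumption, as are the name `sd_distortion_score` and the `n_bins` parameter.

```python
import numpy as np

def sd_distortion_score(z1, z2, n_bins=10):
    """RMS of per-bin medians of the difference v across quantile bins of the sum u.

    A score near zero means the S-D plot is flat around v = 0;
    a larger score indicates a systematic trend or offset.
    """
    u = np.ravel(z1) + np.ravel(z2)
    v = np.ravel(z1) - np.ravel(z2)
    edges = np.quantile(u, np.linspace(0.0, 1.0, n_bins + 1))
    bins = np.clip(np.searchsorted(edges, u, side="right") - 1, 0, n_bins - 1)
    medians = [np.median(v[bins == b]) for b in range(n_bins) if np.any(bins == b)]
    return float(np.sqrt(np.mean(np.square(medians))))

# Synthetic example: the same spots measured with and without a dye bias.
rng = np.random.default_rng(3)
base = rng.uniform(0.0, 4.0, size=500)
noise1 = 0.05 * rng.standard_normal(500)
noise2 = 0.05 * rng.standard_normal(500)
z2 = base + noise2
score_clean = sd_distortion_score(base + noise1, z2)
score_biased = sd_distortion_score(1.2 * base + noise1, z2)
```

The same `u` and `v` values can be passed to any scatter-plot routine to draw the S-D plot itself; comparing the score before and after each correction stage indicates how much distortion that stage removed.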

15. The cDNA microarray data correction method
according to claim 13 or 14, wherein the order statistics
are represented by the following EQ20 (where $w^{k}_{ij}(c)$ is the
standardized gene expression intensity data, $y^{k}_{ij}(c)$ is gene
expression intensity data of all spots obtained in a
channel, and $L_k(c)$ and $M_k(c)$ indicate 25 and 50 percent
points of the gene expression intensity data obtained in
channel c in grid k, respectively):

$$ w^{k}_{ij}(c) = \frac{y^{k}_{ij}(c) - L_k(c)}{M_k(c) - L_k(c)}, \quad c = 1, 2;\ i = 1, \dots, I;\ j = 1, \dots, J;\ k = 1, \dots, K. \tag{20} $$

16. The cDNA microarray data correction method
according to claim 13 or 14, wherein the order statistics
are represented by the following EQ21 (where $w^{k}_{ij}(c)$ is the
standardized gene expression intensity data, $y^{k}_{ij}(c)$ is gene
expression intensity data of all spots obtained in a
channel, and $A_k(c)$, $L_k(c)$ and $M_k(c)$ indicate 35, 10 and 90
percent points of the gene expression intensity data
obtained in channel c in grid k, respectively):

$$ w^{k}_{ij}(c) = \frac{y^{k}_{ij}(c) - A_k(c)}{M_k(c) - L_k(c)}, \quad c = 1, 2;\ i = 1, \dots, I;\ j = 1, \dots, J;\ k = 1, \dots, K. \tag{21} $$

17. The cDNA microarray data correction method
according to claim 15 or 16, wherein, in the step of
standardizing the data, it is determined whether the gene
expression intensity data of all spots obtained in at least
two gene expression intensity data channels has been
standardized and the standardizing is continued until the gene
expression intensity data of all spots has been standardized.

18. The cDNA microarray data correction method
according to claim 17, wherein the standardized gene
expression intensity data is represented by a sum of a true
gene intensity and a distortion depending on the spot
position.

19. The cDNA microarray data correction method
according to claim 13, wherein, in the step of correcting
the data distortion depending on the spot position, the
distortion depending on the spot position is described by
means of a nonparametric regression model represented by a
regression relation of distortions with an x-axis, a y-axis,
and an interaction of the x- and y-axes ($\alpha^{k}_{c}(i)$, $\beta^{k}_{c}(j)$, and
$\gamma^{k}_{c}((i - m_i)(j - m_j))$, respectively) and the distortion
depending on the spot position ($\hat{\xi}^{k}_{ij}(c)$) is estimated by the
nonparametric smoothing method represented by the following
EQ22:

$$ \hat{\xi}^{k}_{ij}(c) = \hat{\alpha}^{k}_{c}(i) + \hat{\beta}^{k}_{c}(j) + \hat{\gamma}^{k}_{c}((i - m_i)(j - m_j)), \quad c = 1, 2;\ i = 1, \dots, I;\ j = 1, \dots, J. \tag{22} $$

20. The cDNA microarray data correction method
according to claim 19, wherein the distortion depending on
the spot position is corrected according to the following
EQ23 (where $\hat{z}^{k}_{ij}(c)$ is corrected true gene expression
intensity data):

$$ \hat{z}^{k}_{ij}(c) = w^{k}_{ij}(c) - \hat{\xi}^{k}_{ij}(c) \tag{23} $$

21. The cDNA microarray data correction method
according to claim 19, wherein the S-D transformation in
the step of correcting the distortion caused by the
difference in sensitivity between fluorescent dyes is
performed according to the following EQ24:

$$ u^{k}_{ij} = \hat{z}^{k}_{ij}(1) + \hat{z}^{k}_{ij}(2), \qquad v^{k}_{ij} = \hat{z}^{k}_{ij}(1) - \hat{z}^{k}_{ij}(2) \tag{24} $$



22. The cDNA microarray data correction method
according to claim 20, wherein, in the step of correcting
the distortion caused by the difference in sensitivity
between the fluorescent dyes, the distortion is described
by means of a nonparametric regression model represented by
the following EQ25, a measurement error caused by the
difference in sensitivity between the fluorescent dyes is
estimated by a nonparametric smoothing method represented
by the following EQ26 and EQ27, and the error is corrected:

$$ v^{k}_{ij} = \phi(u^{k}_{ij}) + \varepsilon^{k}_{ij}, \quad \varepsilon^{k}_{ij} \sim N(0, \nu^{2}) \tag{25} $$

$$ \hat{v}^{k}_{ij} = \hat{\phi}(u^{k}_{ij}) \tag{26} $$

(27)



23. The cDNA microarray data correction method
according to claim 13, wherein, supposing that a
probability of gene expression is lower than 0.5, it is
assumed for the correction that the fluorescence intensity
detected at more than half of the spots within each grid
indicates a background noise or a systematic error.

24. The cDNA microarray data correction method
according to claim 23, wherein, supposing that $L_k(c)$ and
$M_k(c)$ indicate 25 and 50 percent points of the fluorescence
intensity obtained in at least two gene expression
intensity data channels in a grid, it is further assumed
for the correction that $L_k(c)$ and $M_k(c) - L_k(c)$ are equal
among the grids and the channels on condition that most
genes are in a non-expression state and that a distribution
of the fluorescence intensity at or below the 50 percent point
is common to all grids and channels.

25. The cDNA microarray data correction method
according to claim 23, wherein, denoting that $A_k(c)$, $L_k(c)$
and $M_k(c)$ indicate 35, 10 and 90 percent points of the
fluorescence in a grid k for channel c, it is assumed for
the correction that $A_k(c)$ and $M_k(c) - L_k(c)$ are common to all
grids and channels.
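
The effect of this assumption can be checked in a few lines. The sketch below assumes a standardization of the form $(y - A_k(c)) / (M_k(c) - L_k(c))$ with $M_k(c) > L_k(c)$; the symbol $q_p$, denoting the $p$ percent point over the spots of grid $k$ in channel $c$, is introduced here only for illustration.

```latex
\begin{aligned}
q_p\!\left(w^{k}(c)\right) &= \frac{q_p\!\left(y^{k}(c)\right) - A_k(c)}{M_k(c) - L_k(c)}, \\
q_{35}\!\left(w^{k}(c)\right) &= \frac{A_k(c) - A_k(c)}{M_k(c) - L_k(c)} = 0, \\
q_{90}\!\left(w^{k}(c)\right) - q_{10}\!\left(w^{k}(c)\right) &= \frac{M_k(c) - L_k(c)}{M_k(c) - L_k(c)} = 1 .
\end{aligned}
```

The first line holds because an increasing affine map carries percent points to percent points. Under the stated form, every grid and channel therefore has its 35 percent point at 0 and a unit spread between its 10 and 90 percent points after standardization.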




26. A cDNA microarray data correction program
for use in correcting global and local distortions of
microarray data more precisely and correcting measurement
errors caused by a difference in sensitivity between
fluorescent dyes, the program causing a computer to execute
the steps of:

inputting previously-adjusted gene expression
intensity data, considering flag information indicating a
removal of background noise and reliability of each spot;

standardizing the gene expression intensity data
by using grid-by-grid order statistics for the input gene
expression intensity data on condition that most genes are
in a non-expression state;

outputting the standardized gene expression
intensity data;

estimating a distortion depending on the spot
position on grid coordinates for the standardized gene
expression intensity data by a nonparametric smoothing
method and correcting the data distortion depending on the
spot position;

outputting the first corrected gene expression
intensity data whose distortion depending on the spot
position has been corrected;

performing an S-D transformation for the first
corrected gene expression intensity data, estimating a
potential distortion caused by a difference in sensitivity
between the fluorescent dyes in the gene expression
intensity data by the nonparametric smoothing method, and
correcting the distortion caused by the difference in
sensitivity between the fluorescent dyes; and

outputting the second corrected gene expression
intensity data whose distortion caused by the difference in
sensitivity between the fluorescent dyes has been corrected.



27. A computer-readable memory medium
containing a cDNA microarray data correction program for
use in correcting global and local distortions of
microarray data more precisely and correcting measurement
errors caused by a difference in sensitivity between
fluorescent dyes, the program causing a computer to execute
the steps of:

inputting previously-adjusted gene expression
intensity data, considering flag information indicating a
removal of background noise and reliability of each spot;

standardizing the gene expression intensity data
by using grid-by-grid order statistics for the input gene
expression intensity data on condition that most genes are
in a non-expression state;

outputting the standardized gene expression
intensity data;

estimating a distortion depending on the spot
position on grid coordinates for the standardized gene
expression intensity data by a nonparametric smoothing
method and correcting the data distortion depending on the
spot position;

outputting the first corrected gene expression
intensity data whose distortion depending on the spot
position has been corrected;

performing an S-D transformation for the first
corrected gene expression intensity data, estimating a
potential distortion caused by a difference in sensitivity
between the fluorescent dyes in the gene expression
intensity data by the nonparametric smoothing method, and
correcting the distortion caused by the difference in
sensitivity between the fluorescent dyes; and

outputting the second corrected gene expression
intensity data whose distortion caused by the difference in
sensitivity between the fluorescent dyes has been corrected.



